Topological Machine Learning for Mixed Numeric and Categorical Data

Abstract

Topological data analysis is a relatively new branch of machine learning that excels in studying high-dimensional data, and is theoretically known to be robust against noise. Meanwhile, data objects with mixed numeric and categorical attributes are ubiquitous in real-world applications. However, topological methods are usually applied to point cloud data, and to the best of our knowledge there is no available framework for the classification of mixed data using topological methods. In this paper, we propose a novel topological machine learning method for mixed data classification. In the proposed method, we use theory from topological data analysis such as persistent homology, persistence diagrams and Wasserstein distance to study mixed data. The performance of the proposed method is demonstrated by experiments on a heart disease dataset. Experimental results show that the proposed method outperforms several state-of-the-art algorithms in the prediction of heart disease.
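
As a rough illustration of how persistent homology can be run on mixed data, the sketch below computes 0-dimensional persistence (connected components of a Vietoris-Rips filtration) from a pairwise dissimilarity. The abstract does not say which dissimilarity the authors use, so the Gower-style measure here is an assumption, as are the function names `gower_distance` and `h0_persistence` and the made-up toy records. Comparing diagrams with the Wasserstein distance is not shown.

```python
from itertools import combinations


def gower_distance(a, b, numeric_ranges):
    """Gower-style dissimilarity (an assumed choice, not taken from the paper).

    Numeric attributes contribute |x - y| / range; categorical attributes
    contribute 0 on a match and 1 on a mismatch. The result is the mean
    contribution over all attributes, so it lies in [0, 1] when the ranges
    are computed from the data being compared.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            r = numeric_ranges[i]
            total += abs(x - y) / r if r > 0 else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(a)


def h0_persistence(points, dist):
    """Finite death times of 0-dimensional persistence classes.

    Every point is a component born at filtration value 0. Edges are
    processed in order of increasing length; whenever an edge joins two
    different components, one of them dies at that edge length
    (Kruskal's algorithm with union-find). One component never dies and
    is not reported.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)
    return deaths


# Hypothetical toy records: (age, cholesterol, chest pain type, sex).
records = [
    (50, 200, "typical", "M"),
    (52, 210, "typical", "M"),
    (70, 300, "asymptomatic", "F"),
    (68, 290, "asymptomatic", "F"),
]
# Ranges of the numeric columns (indices 0 and 1) over the toy data.
numeric_ranges = {0: 20.0, 1: 100.0}

deaths = h0_persistence(
    records, lambda a, b: gower_distance(a, b, numeric_ranges)
)
diagram = [(0.0, d) for d in deaths]  # (birth, death) pairs
print(diagram)
```

On this toy input the first two death values are small and the last is much larger, which reflects two tight groups of records that merge only late in the filtration. Higher-dimensional homology and diagram distances would normally come from a dedicated library rather than hand-written code.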

Similar Resources

Clustering Large Data Sets with Mixed Numeric and Categorical Values

Efficient partitioning of large data sets into homogeneous clusters is a fundamental problem in data mining. The standard hierarchical clustering methods provide no solution for this problem due to their computational inefficiency. The k-means based methods are promising for their efficiency in processing large data sets. However, their use is often limited to numeric data. In this paper we pres...

Clustering Mixed Numeric and Categorical Data: A Cluster Ensemble Approach

Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes. However, datasets with mixed types of attributes are common in real life data mining applications. In this paper, we propose a novel divide-and-conquer techniq...

An improved k-prototypes clustering algorithm for mixed numeric and categorical data

Data objects with mixed numeric and categorical attributes are commonly encountered in the real world. The k-prototypes algorithm is one of the principal algorithms for clustering this type of data object. In this paper, we propose an improved k-prototypes algorithm to cluster mixed data. In our method, we first introduce the concept of the distribution centroid for representing the prototype of c...

Clustering Algorithm for Incomplete Data Sets with Mixed Numeric and Categorical Attributes

The traditional k-prototypes algorithm is well suited to clustering data with mixed numeric and categorical attributes, but it is limited to complete data. In order to handle incomplete data sets with missing values, an improved k-prototypes algorithm is proposed in this paper, which employs a new dissimilarity measure for incomplete data sets with mixed numeric and categorical attributes and a...

A k-mean clustering algorithm for mixed numeric and categorical data

Use of the traditional k-mean type algorithm is limited to numeric data. This paper presents a clustering algorithm based on the k-mean paradigm that works well for data with mixed numeric and categorical features. We propose a new cost function and distance measure based on co-occurrence of values. The measures also take into account the significance of an attribute towards the clustering process. We pr...

Journal

Journal title: International Journal on Artificial Intelligence Tools

Year: 2021

ISSN: 1793-6349, 0218-2130

DOI: https://doi.org/10.1142/s0218213021500251